Abstract: The draft nuclear genomes of Diplodia sapinea, Ceratocystis moniliformis s. str., and C. 
manginecans are presented. Diplodia sapinea is an important shoot-blight and canker pathogen of Pinus 
spp., C. moniliformis is a saprobe associated with wounds on a wide range of woody angiosperms, and C. 
manginecans is a serious wilt pathogen of mango and Acacia mangium. The genome of D. sapinea 
is estimated at 36.97 Mb and contains 13 020 predicted genes. The C. moniliformis genome comprises 25.43 
Mb and is predicted to encode at least 6 832 genes. This is smaller than the genome of the mango wilt 
pathogen C. manginecans, which is 31.71 Mb and is predicted to encode at least 7 494 genes. The latter is 
thus more similar to C. fimbriata s. str., the type species of the genus. The genome sequences presented 
here provide an important resource to resolve issues pertaining to the taxonomy, biology and evolution of 
these fungi. 
IMA Genome-F 2A 

Draft genome sequence of the pine 
fungal pathogen Diplodia sapinea 

INTRODUCTION 

Diplodia sapinea, also known as Diplodia pinea or 
Sphaeropsis sapinea (Phillips et al. 2013), was first reported 
in France in 1842 as a saprobe on dead Pinus sylvestris. It has 
subsequently been reported from many countries 
on Pinus species growing in their natural environment and 
where they are propagated as non-natives in commercially 
managed plantations (Swart et al. 1991, Burgess et al. 2004). 
This fungus exists as an endophyte in healthy tree tissues, 
but causes disease when trees are stressed (Swart et al. 
1991, Stanosz et al. 2001, 2007). 

No sexual morph has been reported for D. sapinea 
(Smith et al. 2000, Burgess et al. 2004). However, a 
population genetics study of the fungus, considering 
the lack of linkage disequilibrium among alleles and 
the generally high genotypic diversity, proposed that a 
cryptic sexual state probably exists (Bihon 
et al. 2012). In support of this conclusion, a recent study of 
mating type loci showed that various populations of D. sapinea 
contained the two mating type idiomorphs in more or less 
equal frequency, which is indicative of a heterothallic sexual 
cycle (Bihon et al. 2014). 

The aim of this study was to produce a full genome 
sequence for an isolate of D. sapinea and to make this available 
for further study. Such studies could address aspects of the 
biology of the pathogen, such as its selective pathogenicity 
on conifers, in contrast to most other Botryosphaeriales, which 
infect angiosperms. 

SEQUENCED STRAINS 

USA: Wisconsin: isol. ex Pinus banksiana, June 1986, M. 
Palmer (CMW 190/CBS 117911; CBS H-21777 - dried 
cultures). - South Africa: KwaZulu-Natal: 7-Oaks, isol. 
ex Pinus patula, Sept. 2008, W. Bihon (CMW 39103/CBS 
138184; CBS H-21778 - dried cultures). 

NUCLEOTIDE SEQUENCE ACCESSION 
NUMBER 

The Whole Genome Shotgun projects have been deposited at 
DDBJ/EMBL/GenBank under the accessions AXCF00000000 
and JHUM00000000. The versions described in this paper 
are AXCF01000000 and JHUM01000000 for strains 
CMW 190 and CMW 39103, respectively. 



Authors: W. Bihon, M.J. Wingfield, Bernard Slippers, and 
B.D. Wingfield 

IMA Genome-F 2B 

Draft nuclear genome sequence for 
the sapstain fungus Ceratocystis 
moniliformis 

INTRODUCTION 



METHODS 

DNA from single-spore cultures of two strains of Diplodia 
sapinea (CMW 190/CBS 117911 and CMW 39103) was 
extracted and sequenced using Illumina HiSeq and MiSeq 
genome analysers at the Agricultural Research Council (ARC) and 
Inqaba Biotech, Pretoria, South Africa, respectively. The reads 
received were subjected to sequence quality 
analysis, and those shorter than 30 bases were trimmed. 
Reads were assembled into a draft genome using the CLC 
Genomics de novo assembler 6.0 (CLC bio, Aarhus, Denmark). 
Completeness of the genome was estimated using the Core 
Eukaryotic Genes Mapping Approach (CEGMA) analysis 
(Parra et al. 2007). Gene prediction from the genome was 
done using AUGUSTUS (Stanke et al. 2006). 



RESULTS AND DISCUSSION 

The final assembly of isolate CMW 190/CBS 117911 consisted 
of 2194 contigs with an N50 contig size of 37659 bases, and that of 
CMW 39103 had 4102 contigs with an N50 of 38230 bases. 
Maximum contig size was 265719 bases. Contigs longer than 200 
bases were submitted to the genome database of NCBI. The 
output from the CEGMA (Parra et al. 2007) pipeline analysis 
indicated that the genome sequence was 
>95.6 % complete, based on mapping to the highly conserved set of 
248 Core Eukaryotic Genes (CEGs). Putative gene prediction 
using AUGUSTUS (Stanke et al. 2006) identified 13020 open 
reading frames (ORFs). 
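
The N50 statistic quoted above can be illustrated with a minimal sketch. It assumes the common definition of N50 (the contig length L such that contigs of length L or longer contain at least half of all assembled bases); the function name and the toy contig lengths are hypothetical and are not data from this study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    Assumed definition: the length L such that contigs of length >= L
    together contain at least half of all assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs supplied")


# Hypothetical toy assembly (illustrative only, not data from this study).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

In this toy case the two longest contigs (80 and 70 bases) already hold half of the 300 assembled bases, so the N50 is 70.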

The estimated genome size of Diplodia sapinea was 
36.97 Mb, which is smaller than the genomes of the most closely 
related sequenced species, Botryosphaeria dothidea (43.50 
Mb) and Neofusicoccum parvum (42.50 Mb). It contains 
fewer genes than B. dothidea (14999) but more 
than N. parvum (10470) (http://genome.jgi-psf.org/Botdo1/Botdo1.info.html; 
Blanco-Ulate et al. 2013). Diplodia 
sapinea has a similar genome size to Fusarium graminearum 
(36.1 Mb), but a greater number of genes than that species (11640) 
(Cuomo et al. 2007). The genome sequence of D. sapinea 
has already made the characterisation of the MAT 
locus possible (Bihon et al. 2014), and access to this genome 
will no doubt facilitate further research on this important tree 
pathogen. 



The Ceratocystis moniliformis complex defines one of several 
monophyletic assemblages in the genus Ceratocystis sensu 
lato (Yuan & Mohammed 2002, van Wyk et al. 2006, Kamgan 
Nkuekam et al. 2008, 2013, Tarigan et al. 2010, 2011). 
Members of this complex produce hat-shaped ascospores 
from ascomata with spiny bases and have disc-like structures 
at the bases of their ascomatal necks (Hunt 1956, Upadhyay 
1981, van Wyk et al. 2004, 2006). These fungi are relatively 
fast-growing and produce strong fruity aromas and enzymes 
that could be industrially relevant. These include invertases 
that catalyse sucrose biotransformation (van Wyk et al. 2013) 
and various terpenes with fruity or floral odours that are used 
for large-scale production of bioflavours (Krings & Berger 
1998, Vandamme & Soetaert 2002). 

Species in the C. moniliformis complex are found on 
the surfaces of freshly wounded woody plants, especially 
trees (Kile 1993, Roux et al. 2004, Tarigan et al. 2010). 
Interestingly, the fungi in this group are all saprobes (Kile 1993, 
Yuan & Mohammed 2002, Tarigan et al. 2010), unlike species 
in the C. fimbriata complex, which includes serious pathogens 
of economically important plants (Roux et al. 2000, Baker et 
al. 2003, Barnes et al. 2003, van Wyk et al. 2007, Heath et al. 
2009). In some cases, species in the C. moniliformis complex 
cause sapstain that can result in economic losses because they 
lower the value of timber (van Wyk et al. 2006). Ceratocystis 
species are known to be transported to the wounded 
surfaces by insects such as sap-feeding beetles (Coleoptera: 
Nitidulidae) (Kirisits 2004). One species, C. bhutanensis, is 
also associated with a bark beetle (Ips schmutzenhoferi) on 
Picea spinulosa in Bhutan, but it does not appear to be a 
pathogen (van Wyk et al. 2004, Kirisits et al. 2013). 

Overall, little is known regarding the biology of species in 
the C. moniliformis complex. The availability of the nuclear 
genome sequence for one of its members, C. moniliformis 
s.str., would improve our knowledge regarding the molecular 
processes underlying their ecology and potentially inform 
industrial applications for the production of biocompounds. 
Together with the publicly available genome sequences for 
other species of Ceratocystis, particularly the sweet potato 
pathogen C. fimbriata (Wilken et al. 2013) and the mango 
wilt pathogen C. manginecans, the C. moniliformis s.str. 
genome will also be a valuable resource for comparative 
genomics studies into the evolution and general biology of 
these important fungi. 
SEQUENCED STRAIN 

South Africa: Mpumalanga: Sabie, isol. ex Eucalyptus 
grandis, Apr. 2002, M. van Wyk (CMW 10134, CBS 118127; 
CBS H-21775 - dried culture). 



NUCLEOTIDE SEQUENCE ACCESSION 
NUMBER 

The Whole Genome Shotgun project of the Ceratocystis 
moniliformis genome has been deposited at DDBJ/EMBL/ 
GenBank under the accession no. JMSH00000000. The 
version described in this paper is version JMSH01000000. 



and 96.9 % for C. fimbriata s.str.). Although these genome 
differences for C. moniliformis could be linked to its non-pathogenic 
lifestyle (i.e. C. moniliformis is a saprophytic 
fungus that occurs on a wide range of woody hosts; van Wyk 
et al. 2006), further research is required to determine the 
significance of these differences in the overall biology of this 
group of fungi. 

Authors: M.A. van der Nest, K. Naidoo, P.M. Wilken, E. 
Rubagotti, A. Wilson, L. De Vos, E.T. Steenkamp, M.J. 
Wingfield, and B.D. Wingfield 



IMA Genome-F 2C 



METHODS 

Genomic DNA was isolated and sequenced using the 
Genome Analyzer IIx platform (Illumina) at the Genome 
Centre, University of California at Davis (CA, USA). For this 
purpose, paired-end libraries with respective insert sizes of 
approximately 350 and 600 bases were used to produce 
reads with an average length of 100 bases. Poor-quality 
reads and/or terminal nucleotides were discarded using 
the software package CLC Genomics Workbench v. 6.0.1 
(CLCbio, Aarhus, Denmark). The remaining reads were 
assembled using ABySS v. 1.3.7 with an optimized k-mer 
size of 91 (Simpson et al. 2009). Open reading frames 
(ORFs) were predicted using AUGUSTUS (Stanke et al. 
2006) based on the gene models for Fusarium graminearum 
(http://bioinf.uni-greifswald.de/augustus), while genome 
completeness was evaluated using the Core Eukaryotic 
Genes Mapping Approach (CEGMA) pipeline (Parra et al. 
2007). 



RESULTS AND DISCUSSION 

The draft nuclear genome of Ceratocystis moniliformis has 
an estimated size of 25 429 610 bases, an N50 of 191 280 
bases and a mean GC content of 48 %. 
The ABySS assembly generated 680 contigs, of which 365 
were retained after filtering out contigs consisting of fewer 
than 500 nucleotides. This assembly was also predicted 
to encode 6 832 ORFs at a density of 269 ORFs/Mb. A 
CEGMA completeness score of at least 96.4 % was 
obtained for this version of the assembly. 
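
The ORF density reported above follows directly from the ORF count and the estimated genome size. The short sketch below reproduces that arithmetic; the function name is hypothetical, and 1 Mb is assumed to mean 1 000 000 bases.

```python
def orf_density(n_orfs, genome_size_bases):
    """Return the number of predicted ORFs per megabase.

    Assumption: 1 Mb is taken as 1,000,000 bases.
    """
    return n_orfs / (genome_size_bases / 1_000_000)


# Values reported in the text for the C. moniliformis draft assembly.
density = orf_density(6832, 25_429_610)
print(f"{density:.1f} ORFs/Mb")
```

Rounded to the nearest whole number, this agrees with the 269 ORFs/Mb given in the text.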

Comparison of the C. moniliformis genome to those of 
C. fimbriata s.str. (Wilken et al. 2013) and C. manginecans 
showed differences in several key genome statistics. The C. 
moniliformis genome is 4.0 Mb smaller than the 29.4 Mb C. 
fimbriata s.str. genome, and 6.3 Mb smaller than the 31.7 Mb 
C. manginecans genome. Additionally, 434 and 662 fewer 
protein coding genes are predicted for the C. moniliformis 
genome than for C. fimbriata s.str. with its 7 266 predicted 
genes (Wilken et al. 2013) and C. manginecans with its 
7 494 genes (see below), respectively. This is despite the 
fact that the three genomes are characterized by similar 
levels of completeness (i.e. 96.8 % for C. manginecans 



Draft nuclear genome sequence 
for Ceratocystis manginecans, the 
causal agent of mango wilt disease 

INTRODUCTION 

The genus Ceratocystis (Ascomycota, Microascales) 
represents an important group of plant pathogens (Roux 
& Wingfield 2009). These fungi cause diseases on a wide 
range of root and tree crops, where they are associated 
with significant economic losses (Roux & Wingfield 2009). 
The mango (Mangifera indica) wilt pathogen, Ceratocystis 
manginecans, is a particularly virulent member of this 
genus that has devastated the mango industry in Oman and 
Pakistan (Al Adawi et al. 2006, 2013, van Wyk et al. 2007, 
Al-Sadi et al. 2010). This pathogen also threatens leguminous 
trees in Oman, Pakistan and Indonesia (Poussio et al. 2010, 
Tarigan et al. 2011, Al Adawi et al. 2013). 

Ceratocystis manginecans is a member of the C. 
fimbriata s.lat. species complex, which is an assemblage of 
morphologically similar and phylogenetically closely related 
species (Webster & Butler 1967, van Wyk et al. 2007, 
Wingfield et al. 2013). In this complex, C. manginecans is 
closely related to C. acaciivora, which is responsible for 
a debilitating canker and wilt disease of plantation-grown 
Acacia mangium in Indonesia (van Wyk et al. 2007, Tarigan 
et al. 2011). Although there is a need to refine the taxonomic 
position of some species in the complex (Wingfield et al. 
2013), the close relationships among its members could 
indicate similar or shared mechanisms relating to their 
biology and role as pathogens. Questions 
regarding their pathology and general biology would be 
easier to address with genome sequence comparisons (Rokas et al. 
2003, Wall & Tonellato 2012). For this reason, the genome 
of the sweet potato pathogen, C. fimbriata, was recently 
sequenced and shared publicly (Wilken et al. 2013). In 
this study we determined the genome sequence for C. 
manginecans, which will allow for comparisons between the 
two species, advancing studies on various aspects of the 
biology of species in this complex. 
SEQUENCED STRAIN 

Oman: Sohar area, isol. ex Prosopis cineraria, Mar. 2005, 
A. O. Al Adawi (CBS 138185, CMW 17570; CBS H-21776 - 
dried culture). 



NUCLEOTIDE SEQUENCE ACCESSION 
NUMBER 



The Whole Genome Shotgun project of the Ceratocystis 
manginecans genome has been deposited at DDBJ/EMBL/ 
GenBank under the accession number JJRZ00000000. The 
version described in this paper is version JJRZ01000000. 



METHODS 

All sequencing was performed on the Genome Analyzer 
IIx platform (Illumina) at the Genome Centre, University 
of California at Davis (CA, USA). Paired-end libraries with 
respective insert sizes of approximately 350 and 600 
bases were used to produce read lengths of 100 bases. 
The software package CLC Genomics Workbench v. 6.0.1 
(CLCbio, Aarhus, Denmark) was used to discard poor-quality 
reads and/or terminal nucleotides. The remaining reads were 
assembled using the Velvet de novo assembler (Zerbino & 
Birney 2008), with an optimised k-mer size of 71. The 
pre-assemblies were scaffolded using SSPACE v. 2.0 (Boetzer et 
al. 2011) and the gaps were reduced using GapFiller v. 2.2.1 
(Boetzer & Pirovano 2012). Open Reading Frames (ORFs) 
were predicted using AUGUSTUS (Stanke et al. 2006) based 
on the gene models for Fusarium graminearum 
(http://bioinf.uni-greifswald.de/augustus), while genome completeness 
was evaluated using the Core Eukaryotic Genes Mapping 
Approach (CEGMA) pipeline (Parra et al. 2007). 



RESULTS AND DISCUSSION 

The Ceratocystis manginecans draft genome had an 
estimated size of 31 706 104 bases, a 22× average 
coverage, an N50 contig size of 77 070 bases and a mean GC 
content of 47.9 %. The assembly generated 2 234 contigs, of 
which 980 were retained after filtering out contigs consisting 
of fewer than 500 nucleotides. The filtered assembly had 
a CEGMA completeness score of at least 96.4 % and was 
predicted to encode 7 494 putative ORFs at a density of 
236 ORFs/Mb. 
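
The contig filtering step and the GC content statistic described above can be sketched as follows. This is only an illustration under stated assumptions: contigs are held in a dictionary of name to sequence, GC content is computed over unambiguous A, C, G and T bases only, and the function names and toy sequences are hypothetical rather than taken from the pipeline used in this study.

```python
def filter_contigs(contigs, min_length=500):
    """Keep contigs with at least `min_length` nucleotides.

    `contigs` is assumed to be a dict mapping contig name -> sequence.
    """
    return {name: seq for name, seq in contigs.items() if len(seq) >= min_length}


def gc_content(seq):
    """Return GC content as a percentage of unambiguous (A, C, G, T) bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


# Hypothetical toy contigs (illustrative only, not data from this study).
toy = {
    "c1": "ACGT" * 150,  # 600 nt, retained
    "c2": "GGCC" * 50,   # 200 nt, filtered out
    "c3": "AT" * 250,    # 500 nt, retained (not fewer than 500)
}
kept = filter_contigs(toy)
pooled = "".join(kept.values())
print(sorted(kept), f"{gc_content(pooled):.1f} % GC")
```

Pooling the retained sequences before computing GC content weights each contig by its length, which is one reasonable reading of a "mean GC content" for an assembly; a per-contig average would be a different statistic.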

The C. manginecans draft genome is similar in size to 
the genome of the sweet potato pathogen, C. fimbriata (29.4 
Mb, 7 266 ORFs) (Wilken et al. 2013) and the wood-staining 
fungus Ophiostoma piceae (32.84 Mb, 8884 ORFs) (Haridas 
et al. 2013). However, the C. manginecans genome appears 
to be relatively small and harbours fewer genes than other 
species in Sordariomycetes. For example, the genomes of 
Podospora anserina (35.01 Mb, 10588 ORFs) (Espagne 
et al. 2008), Fusarium fujikuroi (43.83 Mb, 14813 ORFs) 
(Wiemann et al. 2013) and Cryphonectria parasitica (43.9 Mb, 
11184 ORFs) (http://genome.jgi.doe.gov/Crypa2/Crypa2.home.html) 
are much larger and harbour more genes. 
The genome sequence information for C. manginecans 
will, therefore, increase our understanding of the biology, 
systematics and pathology of this group of globally important 
pathogens. 

Authors: M.A. van der Nest, K. Naidoo, P.M. Wilken, E. 